Abstract. The ‘pet fish’ phenomenon is often cited as a paradigm example
of the ‘non-compositionality’ of human concept use. We show here
how this phenomenon is naturally accommodated within a compositional
distributional model of meaning. This model describes the meaning of a
composite concept by accounting for interaction between its constituents
via their grammatical roles. We give two illustrative examples to show
how the qualitative phenomena are exhibited. We go on to apply the
model to experimental data, and finally discuss extensions of the formalism.


1 Introduction 


A question in cognitive science is to characterise how humans understand
expressions involving the composition of many concepts. The principle of
semantic compositionality is often attributed to Frege, and is summarized by
Kamp and Partee [1995] as follows:

    The meaning of a complex expression is a function of the meaning of its
    parts and of their syntactic mode of combination.

One approach to understanding the meaning of a complex expression is to model
concepts as sets, and the meaning of a complex expression as determined by
various operations on its parts. Some combinations of concepts such as ‘red car’
can be understood in this way, by taking the intersection of things that are red
and things that are cars, whereas others such as ‘red wine’ cannot be
characterised as a straightforward intersection, because red wine is not a
standard red colour. This failure of concepts to behave conjunctively is
showcased via the ‘pet fish’ problem, given in Osherson and Smith [1981] as a
counterexample to a fuzzy set model of concepts. The problem is as follows.
Concepts, as used by humans, are characterised as fuzzy sets, with the
typicality of an item $x$ to the concept $L$ given by a membership function
$m_L(x)$. A goldfish is considered not to be a very typical pet, nor a very
typical fish, say $m_{pet}(goldfish) = m_{fish}(goldfish) = 0.5$. However the
goldfish is a typical pet fish, say $m_{pet\ fish}(goldfish) = 0.9$. This
contradicts rules of conjunction in fuzzy set theory, in which the membership
of an item in the intersection of two fuzzy sets is given by the minimum of its
membership in either one. The phenomenon

    $m_{A \wedge B}(x) > \min(m_A(x), m_B(x))$

is called overextension and

    $m_{A \vee B}(x) < \max(m_A(x), m_B(x))$

underextension.
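
The clash between the minimum rule and the goldfish ratings can be checked in a few lines. The following is a minimal sketch that only uses the illustrative membership values quoted above; the function and variable names are assumptions made for this example.

```python
def fuzzy_and(m_a, m_b):
    """Minimum rule for membership in the intersection of two fuzzy sets."""
    return min(m_a, m_b)


def fuzzy_or(m_a, m_b):
    """Maximum rule for membership in the union of two fuzzy sets."""
    return max(m_a, m_b)


# Illustrative ratings from the text: a goldfish as a pet, as a fish,
# and as a pet fish.
m_pet = 0.5
m_fish = 0.5
observed_pet_fish = 0.9

predicted_pet_fish = fuzzy_and(m_pet, m_fish)

# Overextension: the observed rating exceeds what the minimum rule allows.
is_overextended = observed_pet_fish > predicted_pet_fish

print(predicted_pet_fish, is_overextended)
```

Whatever the individual ratings are, the minimum rule can never return a value above both of them, which is why a rating of 0.9 for the combination cannot be produced by this model.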


This and similar phenomena have been experimentally verified in Hampton
[1988a,b]. The data collected is a series of typicality ratings for concepts and
their combinations. Further evidence is reported in Hampton [1987], in which
the importance of attributes in conjunctions of concepts is dependent on the
importance of the same attributes in the conjuncts. In addition to the phenomena
of over- and underextension, non-commutativity of membership in conjunctions
is exhibited, together with attributes becoming more or less important depending
on the order of combination. These phenomena are very puzzling if concepts
are conceived of as fuzzy sets. Kamp and Partee [1995] themselves provide a
thorough analysis of these phenomena, and propose a model which circumvents
some of these problems. However, it is not able to account for the ‘pet fish’
phenomenon.

In order to solve the problems encountered by the fuzzy set theoretic view of
concepts, quantum theory inspired models of concepts are developed in, amongst
others, Aerts and Gabora [2005a,b], Aerts et al. [2013], Pothos and Busemeyer
[2012]. These models view concepts as subspaces of a vector space. Similarity of
one concept to another is measured by the projection of one concept onto
another. The combination of two concepts is modelled as a tensor product. The
‘pet fish’ phenomenon then arises when the representation of the combined
concept is entangled. The combination of more than two concepts may also be
effected simply by taking the tensor product of the constituent concept spaces.
These approaches, however, require that the representational space for a concept
grows in size as more elements are added to the compound. This means that
compounds will become unwieldy, and more importantly that compound concepts
cannot be directly compared with their constituents.

It is then possible to investigate how well these models characterise human
concept combination by examining the probability of agreeing with certain
statements, or rating similarity between concepts. A review of such phenomena is
given in Pothos and Busemeyer [2012]. Data such as that collected by Hampton
[1988a,b] is argued to provide evidence for quantum structure in human thought
[Aerts, 2009], since the similarity ratings do not adhere to classical
probability theory. This failure to adhere to classical probability theory is
then termed ‘non-compositionality’. However, whilst these approaches can provide
a modelling of the data, they do not give an explanation of the fact that humans
do successfully combine concepts. Further, the role of syntax is ignored. We
will argue that the compositional nature of human thought can be explained, in a
way that retains the qualitative phenomena discussed, and in a way that maps
meanings to one shared space.


Clark and Pulman [2007] represent the meaning of words as vectors, and
use the tensor product to combine words and their grammatical roles into
sentences, using ideas developed in Smolensky and Legendre [2006]. However, this
again has the limitation, explicitly recognised, that sentences and words of
different compositional types may not be compared. Coecke et al. [2010], in part
inspired by the high-level description of quantum information flow introduced in
Abramsky and Coecke [2004], utilise grammar in order to use composite spaces
without increase in size of the resulting meaning space. This is the model that we
will use to explain these compositional phenomena. The meaning of a composite
concept, or a sentence, arises from the meaning of its constituents, mediated by
grammar. This allows composite concepts to be directly compared with their
constituents, and the meaning of sentences of varying length and type to be
compared. In this paper we will apply this compositional distributional model
of meaning to the ‘pet fish’ problem, and show how the qualitative phenomena
of overextension and non-commutativity naturally arise, i.e., we show how a
goldfish can be a good ‘pet fish’, even though it is neither a typical pet, nor a
typical fish, and within this we also explain why a ‘pet fish’ is different to a ‘fishy
pet’. We provide a modelling of experimental data, and go on to discuss how a
fuller account of human concept use will be developed.

The remainder of the paper is organised as follows. In section 2 we describe
the compositional distributional model of semantics. In section 3 we describe how
the ‘pet fish’ phenomenon may be modelled and give two illustrative examples.
We go on to model experimental data in section 4 and discuss the results and
further work in section 5.


2 A Compositional Distributional Approach to Meaning 


The account of meaning of Coecke et al. [2010] unifies compositional accounts of
meaning, where the meaning of a sentence is seen as a function of the meanings
of the individual words in the sentence, and distributional accounts of meaning,
where the meanings of individual words are characterised as vectors. In
distributional semantics, word meanings are mainly derived from text corpora
via word co-occurrence statistics. Other methods for deriving such meanings may
also be used. In particular, we can view the dimensions of the vector space as
attributes of the concept, and experimentally determined attribute importance as
the weighting on that dimension. The compositional account and the
distributional account are linked by the fact that they share the common
structure of a compact closed category. This allows the compositional rules of
the grammar to be applied in the vector space model to map syntactically
well-formed strings of words into one shared meaning space.

In the remainder of this section, we firstly describe pregroup grammar, used
for the compositional rules. We then show how the reductions of the grammar may
be mapped to finite dimensional vector spaces, and go on to give an example
sentence reduction. We then describe a further type of structure, Frobenius
algebras. These have been used to model information flow in sentences using
relative pronouns such as ‘which’. We may therefore use these structures to model
the phrase ‘pet which is a fish’.

















2.1 Pregroup Grammars 


In order to characterise the composition of concepts, the model uses Lambek’s
pregroup grammar [Lambek, 1999]. A pregroup $(P, \leq, \cdot, 1, (-)^l, (-)^r)$
is a partially ordered monoid $(P, \leq, \cdot, 1)$ where each element $p \in P$
has a left adjoint $p^l$ and a right adjoint $p^r$, that is, the following
inequalities hold:

    $p^l \cdot p \leq 1 \leq p \cdot p^l$ and $p \cdot p^r \leq 1 \leq p^r \cdot p$

The pregroup grammar then uses atomic types, such as $n$, $s$, their adjoints
$n^l$, $s^r$, ..., and composite types which are formed by concatenating atomic
types and their adjoints. We use the type $s$ to denote a declarative sentence
and $n$ to denote a noun. A transitive verb can then be denoted $n^r s n^l$. If a
string of words and their types reduces to the type $s$, the sentence is judged
grammatical. The sentence ‘James shoots pigeons’ is typed $n (n^r s n^l) n$, and
can be reduced to $s$ as follows:

    $n (n^r s n^l) n \leq 1 \cdot s n^l n \leq 1 \cdot s \cdot 1 \leq s$

This symbolic reduction can also be expressed graphically as follows:

[Diagram: ‘James’ ($n$), ‘shoots’ ($n^r s n^l$), ‘pigeons’ ($n$); cups join $n$
to $n^r$ and $n^l$ to $n$, and a straight wire carries $s$.]

In this graphical representation, the elimination of types by means of the
inequalities $n \cdot n^r \leq 1$ and $n^l \cdot n \leq 1$ is represented by a
‘cup’ while the fact that the type $s$ is retained is represented by a straight
wire.
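
The reduction of ‘James shoots pigeons’ can be imitated with a small stack-based routine. This is only a sketch under stated assumptions: simple types are encoded as hypothetical pairs `(base, z)`, with `z = -1` for a left adjoint, `0` for the plain type and `+1` for a right adjoint, and the greedy cancellation used here is a heuristic that suffices for these short examples rather than a complete pregroup parser.

```python
# Encoding (an assumption of this sketch): n^l = ("n", -1), n = ("n", 0),
# n^r = ("n", 1), and likewise for s.
N = ("n", 0)
S = ("s", 0)
N_L = ("n", -1)
N_R = ("n", 1)


def reduce_types(types):
    """Greedily cancel adjacent pairs (x, z)(x, z + 1), which are <= 1.

    With z = 0 this is n . n^r <= 1; with z = -1 it is n^l . n <= 1.
    """
    stack = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()  # the pair contracts to the unit
        else:
            stack.append((base, z))
    return stack


james = [N]
shoots = [N_R, S, N_L]  # transitive verb: n^r s n^l
pigeons = [N]

sentence_type = reduce_types(james + shoots + pigeons)
print(sentence_type)  # the remaining type after all contractions

# An adjective typed n n^l applied to a noun reduces to a noun.
red_car_type = reduce_types([N, N_L] + [N])
```

Each `pop` corresponds to one ‘cup’ in the diagram, and the type left on the stack corresponds to the straight wire.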


2.2 Grammatical Reductions in Vector Spaces 


We give here a brief description of how the reductions of the pregroup grammar
may be mapped into the category of vector spaces. For full details, see
Coecke et al. [2010].


The mapping uses the fact that both pregroups and vector spaces are instances
of compact closed categories. We map the atomic types $n$, $s$ of the
pregroup grammar to vector spaces $N$, $S$. In the category of vector spaces over
$\mathbb{R}$, only one adjoint $N^*$ of $N$ exists and it is isomorphic to $N$,
and hence $n^r$ and $n^l$ also both map to $N$. Similarly $s^r$, $s^l$ map to $S$.

The concatenation of types to form composites is mapped to the tensor
product $\otimes$. So the sentence ‘James shoots pigeons’ is typed
$n\, n^r s n^l\, n$ in the pregroup grammar, and is represented in the tensor
product of vector spaces

    $N \otimes N \otimes S \otimes N \otimes N$.













The reductions $\epsilon^r : n n^r \leq 1$, $\epsilon^l : n^l n \leq 1$ are
mapped to the linear extension to the tensor product of the inner product:

    $\epsilon : N \otimes N \to \mathbb{R} :: \sum_{ij} c_{ij}\, \vec{n}_i \otimes \vec{n}_j \mapsto \sum_{ij} c_{ij} \langle \vec{n}_i | \vec{n}_j \rangle$

and type introductions $\eta^r : 1 \leq n^r n$, $\eta^l : 1 \leq n n^l$ are
implemented as

    $\eta : \mathbb{R} \to N \otimes N :: 1 \mapsto \sum_i \vec{n}_i \otimes \vec{n}_i$

These are all presented mathematically rigorously in Coecke et al. [2010] and
an introduction to relevant category theory is given in Coecke and Paquette
[2011].

Now, individual words are assigned meanings within the vector space. This
might be implemented via statistical analysis of text corpora, or attribute
weightings elicited in an experiment. Nouns are represented as vectors in a
single space $N$, whereas other word types are represented in larger spaces. For
example, transitive verbs are typed $n^r s n^l$ and are therefore represented in
the rank 3 space $N \otimes S \otimes N$. The reductions of the pregroup are
implemented as inner products, and if the sentence is grammatical, one sentence
vector $\vec{s} \in S$ should result. Meanings of sentences may then be compared
by computing the cosine of the angle between their vector representations. So,
if sentence A has vector representation $\vec{A}$ and sentence B has
representation $\vec{B}$, their degree of synonymy is given by:

    $\mathrm{sim}(A, B) = \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\| \, \|\vec{B}\|}$    (1)
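
The similarity measure can be written directly from its definition. The sketch below uses plain Python lists for vectors; the function name and the three toy vectors are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical sentence vectors in a three-dimensional sentence space.
s_a = [1.0, 2.0, 0.0]
s_b = [2.0, 4.0, 0.0]  # same direction as s_a
s_c = [0.0, 0.0, 3.0]  # orthogonal to s_a

print(cosine_similarity(s_a, s_b))
print(cosine_similarity(s_a, s_c))
```

Because the measure depends only on direction, rescaling a meaning vector does not change its similarity to any other vector.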




We now give an example sentence reduction. 


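An example sentence. To express the sentence ‘James shoots pigeons’, we
firstly give types to the words in the sentence, so as before we have
$n (n^r s n^l) n$. The reduction of the sentence is
$\epsilon^r \otimes 1_s \otimes \epsilon^l$. Moving to the vector space, we
represent nouns within a vector space $N$, sentences within a vector space $S$. A
transitive verb is then represented in the tensor product space
$N \otimes S \otimes N$.

So if we represent $\overline{shoots}$ by:

    $\overline{shoots} = \sum_{ijk} c_{ijk}\, \vec{n}_i \otimes \vec{s}_j \otimes \vec{n}_k$

then we have:

    $\vec{James\ shoots\ pigeons} = (\epsilon_N \otimes 1_S \otimes \epsilon_N)(\vec{James} \otimes \overline{shoots} \otimes \vec{pigeons})$
    $= \sum_{ijk} c_{ijk} \langle \vec{James} | \vec{n}_i \rangle \otimes \vec{s}_j \otimes \langle \vec{n}_k | \vec{pigeons} \rangle$
    $= \sum_j \sum_{ik} c_{ijk} \langle \vec{James} | \vec{n}_i \rangle \langle \vec{n}_k | \vec{pigeons} \rangle\, \vec{s}_j$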
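
The final line of the derivation is a contraction of the verb tensor with the subject on one side and the object on the other. A minimal sketch follows, assuming an orthonormal basis so that each inner product is simply a coordinate of the noun vector; the nested-list layout `verb[i][j][k]` for $c_{ijk}$ and all numerical values are hypothetical.

```python
def sentence_vector(subject, verb, obj):
    """Contract a rank 3 verb tensor with its subject and object.

    verb[i][j][k] plays the role of c_ijk; the result is indexed by j,
    the sentence dimension.
    """
    dim_s = len(verb[0])
    out = [0.0] * dim_s
    for i, subj_i in enumerate(subject):
        for j in range(dim_s):
            for k, obj_k in enumerate(obj):
                out[j] += verb[i][j][k] * subj_i * obj_k
    return out


# Toy spaces: N is two-dimensional, S is one-dimensional.
james = [1.0, 0.0]
pigeons = [0.0, 1.0]
shoots = [
    [[0.0, 0.8]],  # i = 0: weights c_0jk
    [[0.1, 0.0]],  # i = 1: weights c_1jk
]

meaning = sentence_vector(james, shoots, pigeons)
print(meaning)
```

The result lives in the sentence space alone, whatever the dimension of the noun space, which is the sense in which composite meanings do not grow with the number of words.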
























Strings of words may also reduce to other types, such as nouns. An adjective
can be given the type $n n^l$, and then a phrase such as ‘red car’ is typed
$n n^l n \leq n$. Another type of word that has been well studied is the
relative pronoun. In Sadrzadeh et al. [2013], the authors analyse the relative
pronoun as utilising a Frobenius algebra.


2.3 Frobenius Algebras 


We state here how a Frobenius algebra is implemented within a vector space
over $\mathbb{R}$. For a mathematically rigorous presentation see Sadrzadeh et
al. [2013]. A vector space $V$ over $\mathbb{R}$ with a fixed basis
$\{\vec{v}_i\}_i$ has a Frobenius algebra given by:

    $\Delta :: \vec{v}_i \mapsto \vec{v}_i \otimes \vec{v}_i \qquad \iota :: \vec{v}_i \mapsto 1 \qquad \mu :: \vec{v}_i \otimes \vec{v}_j \mapsto \delta_{ij} \vec{v}_i$

This algebra is commutative, so for the swap map
$\sigma : X \otimes Y \to Y \otimes X$, we have $\sigma \circ \Delta = \Delta$
and $\mu \circ \sigma = \mu$. It is also special so that
$\mu \circ \Delta = 1$. Essentially, the $\mu$ morphism amounts to taking the
diagonal of a matrix, and $\Delta$ to embedding a vector within a diagonal
matrix. This algebra may be used to model the flow of information in noun
phrases with relative pronouns.
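
The description of the copy and merge maps as ‘embed in a diagonal matrix’ and ‘take the diagonal’ can be made concrete. The sketch below assumes vectors are lists of coordinates in the fixed basis and rank 2 tensors are nested lists; the function names are chosen for this example.

```python
def delta(v):
    """Copy map: embed a vector as a diagonal matrix."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def mu(m):
    """Merge map: keep only the diagonal of a square matrix."""
    return [m[i][i] for i in range(len(m))]


def swap(m):
    """Swap the two tensor factors, i.e. transpose the matrix."""
    return [list(row) for row in zip(*m)]


v = [0.2, 0.5, 0.9]
m = [[1.0, 2.0], [3.0, 4.0]]

# Special: merging after copying gives back the original vector.
round_trip = mu(delta(v))

# Commutative: swapping after copying, or before merging, changes nothing.
copy_is_symmetric = swap(delta(v)) == delta(v)
merge_ignores_swap = mu(swap(m)) == mu(m)

print(round_trip, copy_is_symmetric, merge_ignores_swap)
```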


An example noun phrase. In Sadrzadeh et al. [2013], the authors describe
how the subject and object relative pronouns may be analysed. We describe
here the subject relative pronoun. The phrase ‘James who shoots pigeons’ is a
noun phrase; it describes James. The meaning of the phrase should therefore
be James, modified somehow. The word ‘who’ is typed $n^r n s^l n$, so the phrase
‘James who shoots pigeons’ may be reduced as follows:

[Diagram: ‘James’ ($n$), ‘who’ ($n^r n s^l n$), ‘shoots’ ($n^r s n^l$),
‘pigeons’ ($n$), with wires showing the reduction to type $n$.]

Sadrzadeh et al. [2013] go on to show that this may be reduced to

    $(\mu_N \otimes \iota_S \otimes \epsilon_N)(\vec{James} \otimes \overline{shoots} \otimes \vec{pigeons})$

This gives the result:

    $\vec{James\ who\ shoots\ pigeons} = \vec{James} \odot (\overline{shoots} \times \vec{pigeons})$    (2)

where $\overline{shoots}$ is a matrix representing the verb ‘shoot’, $\times$
refers to matrix multiplication and $\odot$ refers to elementwise
multiplication.
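
The formula for the subject relative pronoun combines one matrix-vector product with one elementwise product. A minimal sketch follows; the two-dimensional noun space and every number in it are hypothetical and serve only to show the order of operations.

```python
def mat_vec(matrix, vector):
    """Matrix multiplication of a matrix (list of rows) with a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def hadamard(a, b):
    """Elementwise multiplication of two vectors."""
    return [x * y for x, y in zip(a, b)]


def subject_relative(subject, verb, obj):
    """Meaning of 'subject who/which verb obj': subject (.) (verb x obj)."""
    return hadamard(subject, mat_vec(verb, obj))


james = [1.0, 0.5]
pigeons = [0.0, 2.0]
shoots = [
    [0.1, 0.3],
    [0.4, 0.2],
]

phrase = subject_relative(james, shoots, pigeons)
print(phrase)
```

The output has the same length as the subject vector, so the phrase can be compared directly with plain nouns.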

We have given a brief overview of the key aspects of the model of meaning. 
These ideas will now be used to give an account of the ‘pet fish’ phenomenon, 
firstly by recognising that ‘pet’ in ‘pet fish’ functions as an adjective, and sec¬ 
ondly by analysing the expression ‘pet which is a fish’. 

3 A Compositional Distributional Account of the ‘Pet Fish’ Phenomenon

We will use the compositional distributional model of meaning to give an account
of the ‘pet fish’ problem. The fuzzy-set theoretical view of the problem sees the
types of the words ‘pet’ and ‘fish’ as the same, namely that both should be viewed
as nouns and the phrase ‘pet fish’ formed from their intersection. However, the
word ‘pet’ in this context is undeniably an adjective. Within pregroup grammar,
it is typed as $n n^l$, and therefore within the model, this is viewed as a matrix
$\overline{pet} = \sum_{ij} p_{ij}\, \vec{n}_i \otimes \vec{n}_j$. The meaning of
‘pet fish’ is therefore

    $\vec{pet\ fish} = \sum_{ij} p_{ij}\, \vec{n}_i \langle \vec{n}_j | \vec{fish} \rangle$



The problem is essentially that an item such as a goldfish may not be a typical
‘fish’, nor a typical ‘pet’, but a very typical ‘pet fish’. To measure this within
our model, we use the cosine similarity of meaning vectors
$\mathrm{sim}(\vec{A}, \vec{B})$ given in equation (1) as a proxy for typicality.
We may justify this by remarking that typicality may be characterised as a
function of similarity. Given two word vectors, e.g. $\vec{dog}$ and $\vec{pet}$,
the vector representing ‘dog’ should have a higher similarity to the concept
‘pet’ than, for example, ‘spider’ should, i.e. we should have:

    $\mathrm{sim}(\vec{dog}, \vec{pet}) = \frac{\vec{dog} \cdot \vec{pet}}{\|\vec{dog}\| \, \|\vec{pet}\|} > \frac{\vec{spider} \cdot \vec{pet}}{\|\vec{spider}\| \, \|\vec{pet}\|} = \mathrm{sim}(\vec{spider}, \vec{pet})$


We now examine the effect of concept combination on cosine similarity, and 
see that the ‘pet fish’ phenomenon is reproduced. 

3.1 Creating Adjectives 

Adjectives, verbs, adverbs and various other grammatical types may be viewed as
operators. An adjective, in particular, is a matrix in $N \otimes N$, and the
application of an adjective to a noun within the framework is matrix
multiplication of the noun by the adjective. It is simple to craft a matrix
$\overline{pet}$ that transforms the vector $\vec{fish}$ such that:

    $\mathrm{sim}(\overline{pet} \times \vec{fish}, \vec{goldfish}) > \mathrm{sim}(\vec{fish}, \vec{goldfish})$

where $\times$ refers to matrix multiplication.


However, this process may be simplified even further. Kartsaklis et al. [2013]
characterise an n-ary operator by the sum of the words it takes as arguments.
For example ‘pet’ is $\sum_t \vec{n}_t$ where the $\vec{n}_t$ are words that
co-occur with ‘pet’ such as ‘dog’, ‘cat’, ‘spider’ and so forth. Each word may
occur more than once in this sum to give a frequency calculation. This gives a
tensor whose rank is one less than the desired rank. Therefore, Kartsaklis et
al. [2013] suggest expanding the rank using the Frobenius copy operator
$\Delta$. The meaning of an adjective-noun combination such as ‘pet fish’ may
then be calculated as

    $\vec{pet\ fish} = \sum_{ij} p_{ij}\, \vec{n}_i \langle \vec{n}_j | \vec{fish} \rangle$

If the Frobenius copy operation and the inner product are both carried out
with respect to the same basis, we obtain

    $\vec{pet\ fish} = \vec{pet}_{adj} \odot \vec{fish}_{noun}$

Although the vectors $\vec{pet}_{adj}$ and $\vec{pet}_{noun}$ are likely to be
similar, they will not be identical. This allows the combinations ‘pet fish’ and
‘fishy pet’ to be different. We will see that this method of forming adjectives
can reproduce the qualitative phenomena required, and it is notable that we can
do this without expanding the rank of the data space.
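
The step from the matrix form to the elementwise product can be verified numerically: copying the adjective vector onto the diagonal of a matrix and then applying that matrix to the noun gives the same result as multiplying the two vectors entry by entry. The sketch assumes both operations use the same basis, as stated above; the three-dimensional vectors are hypothetical.

```python
def delta(v):
    """Frobenius copy: place the vector on the diagonal of a matrix."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_vec(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


pet_adj = [2.0, 0.5, 1.0]  # hypothetical adjective vector
fish = [0.2, 0.8, 1.0]     # hypothetical noun vector

via_matrix = mat_vec(delta(pet_adj), fish)
via_elementwise = [p * f for p, f in zip(pet_adj, fish)]

print(via_matrix, via_elementwise)
```

This is why the adjective can be stored as a vector of the same size as the noun rather than as a full matrix.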

Another way of expressing the combination of the two concepts ‘pet’ and
‘fish’ is by the extended ‘fish which is a pet’, or ‘pet which is a fish’. In the
experiments from Hampton, this is how the two combinations are phrased. One of
the key findings from these experiments was that the typicality of an item to
each ordering is not identical. In order to express ‘fish which is a pet’, we
use the verb ‘to be’ as a transitive verb. Transitive verbs are typed as
$N \otimes S \otimes N$ and can also be expressed as a sum of their arguments,
i.e. as $\sum_{ij} \vec{n}_i \otimes \vec{n}_j$ where $\vec{n}_i$ is the subject
of the verb and $\vec{n}_j$ the object. Here we have discarded the sentence type
$S$. In the treatment of the relative pronoun ‘which’, the sentence type is also
discarded, and therefore the verb matrix can be used directly. We can express the
combination ‘fish which is a pet’ analogously to the expression given in
equation (2):

    $\vec{fish\ which\ is\ a\ pet} = \vec{fish} \odot (\overline{is} \times \vec{pet})$

where $\overline{is}$ is the matrix of the verb ‘to be’.

In the next section we show that these two methods of combination give us
some of the qualitative properties we need for the ‘pet fish’ phenomenon.





























3.2 Toy Model Using Adjective Vectors 

This toy model applies the way of forming an adjective as a sum of noun
vectors. Suppose that the nouns ‘dog’, ‘cat’, ‘goldfish’, ‘shark’, ‘pet’ and
‘fish’ have attribute weights as presented in Table 1. These weights are
hypothetical, but could be elicited from humans in an experiment, or from text
corpora. We can form the adjective ‘pet’ by forming a superposition of ‘dog’,
‘cat’, and ‘goldfish’:

    $\vec{pet}_{adj} = \vec{dog} + \vec{cat} + \vec{goldfish} = [2.5, 0.6, 1.6, 1, 0, 2.7]^T$

The cosine similarities of each animal to the nouns ‘pet’, ‘fish’, and ‘pet fish’
are shown in Table 2. The qualitative phenomenon that a goldfish is a better
example of a pet fish than of either a pet or a fish is exhibited.


Table 1: List of concepts and attributes for adjective vector model

                    pet    fish   goldfish   cat    dog    shark
cared-for           1      0.2    0.7        0.9    0.9    0
vicious             0.2    0.8    0          0.2    0.4    1
fluffy              0.7    0      0          0.9    0.7    0
scaly               0.2    1      1          0      0      1
lives in the sea    0      0.8    0          0      0      1
lives in house      0.9    0.3    0.9        0.9    0.9    0


Table 2: Cosine similarity for adjective vector model

                  goldfish   cat      dog      shark
pet (noun)        0.7309     0.9816   0.9809   0.1497
fish              0.5989     0.2500   0.3292   0.9670
pet (adj) fish    0.9377     0.5524   0.6197   0.5861
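
The toy model can be recomputed from the attribute weights. The sketch below transcribes the weights of Table 1, forms the adjective vector as the sum of ‘dog’, ‘cat’ and ‘goldfish’, and applies it to ‘fish’ elementwise. It is an illustration under the assumption that this reading of the weights is the intended one: the ‘pet’ and ‘fish’ rows come out as tabulated, while the combined row for goldfish comes out at roughly 0.94, close to but not digit-for-digit the tabulated value.

```python
import math

# Attribute order: cared-for, vicious, fluffy, scaly,
# lives in the sea, lives in house (weights transcribed from Table 1).
CONCEPTS = {
    "pet":      [1.0, 0.2, 0.7, 0.2, 0.0, 0.9],
    "fish":     [0.2, 0.8, 0.0, 1.0, 0.8, 0.3],
    "goldfish": [0.7, 0.0, 0.0, 1.0, 0.0, 0.9],
    "cat":      [0.9, 0.2, 0.9, 0.0, 0.0, 0.9],
    "dog":      [0.9, 0.4, 0.7, 0.0, 0.0, 0.9],
    "shark":    [0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
}


def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Adjective 'pet' as a superposition of the nouns it co-occurs with.
pet_adj = [
    d + c + g
    for d, c, g in zip(CONCEPTS["dog"], CONCEPTS["cat"], CONCEPTS["goldfish"])
]

# Adjective-noun combination as an elementwise product.
pet_fish = [p * f for p, f in zip(pet_adj, CONCEPTS["fish"])]

goldfish = CONCEPTS["goldfish"]
sims = {
    "pet": cosine(CONCEPTS["pet"], goldfish),
    "fish": cosine(CONCEPTS["fish"], goldfish),
    "pet fish": cosine(pet_fish, goldfish),
}

for name, value in sims.items():
    print(name, round(value, 4))
```

The qualitative point survives any small numerical differences: the goldfish is more similar to the combination than to either constituent.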


3.3 Toy Model Using Relative Pronouns 

We give here a toy model for the composition of two concepts using Frobenius
multiplication, and show that the qualitative attributes we require are exhibited.
Suppose we again have attributes as listed in the rows of Table 1 for the concepts
as given in the columns.

Weights for the matrix for the verb ‘to be’ are given in Table 3 below. These
weights express the extent to which the attributes co-occur. For example, the
extent to which something that is vicious is also cared-for is given weight 0.02.
We assume that the extent to which an attribute is itself, for example, the
extent to which a vicious thing is vicious, is greater than the extent to which
it is anything else. Hence the diagonal is emphasised. Ideally, the verb ‘to be’
would interact directly with the attribute space as does the relative pronoun,
and this will form a future line of enquiry.


Table 3: Matrix of weights for the verb ‘to be’

                    cared-for   vicious   fluffy   scaly   water   house
cared-for           1           0.02      0.08     0.03    0.02    0.09
vicious             0.02        1         0.05     0.06    0.05    0.02
fluffy              0.08        0.05      1        0       0       0.09
scaly               0.03        0.06      0        1       0.05    0.02
lives in the sea    0.02        0.05      0        0.05    1       0
lives in house      0.09        0.02      0.09     0.02    0       1


Using this representation of the verb ‘to be’ and the grammatical structure
of relative pronouns, we have the following:

    $\vec{pet\ which\ is\ a\ fish} = \vec{pet} \odot (\overline{is} \times \vec{fish}) = [0.29, 0.18, 0.06, 0.22, 0, 0.32]^T$

and

    $\vec{fish\ which\ is\ a\ pet} = \vec{fish} \odot (\overline{is} \times \vec{pet}) = [0.23, 0.23, 0, 0.26, 0.03, 0.32]^T$

The similarity of each of the vectors for ‘goldfish’, ‘cat’, and ‘shark’ to the
concepts ‘pet’, ‘fish’, ‘pet which is a fish’ and ‘fish which is a pet’ is given
in Table 4. We see that the similarity of ‘goldfish’ to ‘pet which is a fish’ and
‘fish which is a pet’ is higher than its similarity to either ‘pet’ or ‘fish’. In
addition, the similarity of each animal to ‘pet which is a fish’ is not identical
to its similarity to ‘fish which is a pet’.


Table 4: Cosine similarity for relative pronoun model

                       goldfish   cat      shark
pet                    0.7309     0.9816   0.1497
fish                   0.5989     0.2500   0.9670
pet which is a fish    0.8999     0.7783   0.4467
fish which is a pet    0.8898     0.6540   0.5730
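
The relative pronoun model can be recomputed in the same way. The following sketch transcribes the ‘pet’, ‘fish’ and ‘goldfish’ columns of Table 1 and the matrix of Table 3, and evaluates both orderings; the helper names are assumptions of this example.

```python
import math

# Attribute order: cared-for, vicious, fluffy, scaly,
# lives in the sea, lives in house (weights transcribed from Table 1).
PET = [1.0, 0.2, 0.7, 0.2, 0.0, 0.9]
FISH = [0.2, 0.8, 0.0, 1.0, 0.8, 0.3]
GOLDFISH = [0.7, 0.0, 0.0, 1.0, 0.0, 0.9]

# Matrix for the verb 'to be' (transcribed from Table 3).
IS = [
    [1.00, 0.02, 0.08, 0.03, 0.02, 0.09],
    [0.02, 1.00, 0.05, 0.06, 0.05, 0.02],
    [0.08, 0.05, 1.00, 0.00, 0.00, 0.09],
    [0.03, 0.06, 0.00, 1.00, 0.05, 0.02],
    [0.02, 0.05, 0.00, 0.05, 1.00, 0.00],
    [0.09, 0.02, 0.09, 0.02, 0.00, 1.00],
]


def mat_vec(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def which_is_a(head, verb, modifier):
    """'head which is a modifier' as head (.) (verb x modifier)."""
    return [h * x for h, x in zip(head, mat_vec(verb, modifier))]


def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


pet_which_is_a_fish = which_is_a(PET, IS, FISH)
fish_which_is_a_pet = which_is_a(FISH, IS, PET)

sim_pet_fish = cosine(pet_which_is_a_fish, GOLDFISH)
sim_fish_pet = cosine(fish_which_is_a_pet, GOLDFISH)

print(round(sim_pet_fish, 4), round(sim_fish_pet, 4))
```

Both orderings rate the goldfish above its similarity to ‘pet’ or ‘fish’ alone, and the two orderings differ because the head noun enters through the elementwise product while the modifier enters through the verb matrix.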


4 Modelling Attribute Combination 

Hampton [1987] collects data on the importance of attributes in concepts and
their combination. Concepts are considered in pairs that are related
to some degree, for example ‘Pets’ and ‘Birds’. Six pairs are considered in total,
detailed below. Participants are asked to generate attributes for each concept
and for their conjunctions, where conjunction in this case is rendered as ‘Pets
which are also Birds’, or ‘Birds which are also Pets’. For example, attributes such
as ‘lives in the house’, ‘is an animal’, ‘has two legs’ are generated for ‘Pets’
and ‘Birds’. For each pair of concepts and their conjunction, attributes that had
been generated by at least 3 out of the 10 participants were collated.
Participants were then asked to rate the importance of each attribute to each
concept and to each conjunction. Importance ratings were made on a 7 point
verbal scale ranging from ‘Necessarily true of all examples of the concept’ to
‘Necessarily false of all examples of the concept’. Numerical ratings were
subsequently imposed ranging from 4 to -2 respectively.

The question then arises of how the importance of attributes in the conjunc¬ 
tion of the concepts is related to the importance of attributes in the constituent 
concepts. Key phenomena are that conjunction is not commutative, that inheri¬ 
tance failure can occur (i.e., an attribute that is important in one of the concepts 
is not transferred to the conjunction), that attribute emergence can occur, where 
an attribute that is important in neither of the conjuncts becomes important in 
the conjunction, and further, that necessity and impossibility are preserved. In 
order to model this data, Hampton uses a multilinear regression. 

We use the importance values for each attribute and their conjunction to
determine a set of weights for the verb ‘to be’, which, when substituted into the
phrase ‘A which is a B’, or ‘B which is an A’, provides the appropriate attribute
weights. We require a matrix $\overline{is}$ such that:

    $\vec{A} \odot (\overline{is} \times \vec{B}) = \vec{AB}$

and

    $\vec{B} \odot (\overline{is} \times \vec{A}) = \vec{BA}$

where $\vec{AB}$ stands for the combination ‘A which is a B’ and vice versa. The
importance values are rated on a scale 4, 3, 2, 1, -1, -2. We map these into a
[0,1] interval by $r \mapsto (r + 2)/6$, where $r$ is the relevant rating.
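
The rescaling of ratings is a single affine map. A minimal sketch, with the function name chosen for this example:

```python
def rescale(rating):
    """Map an importance rating from the 4 .. -2 scale into [0, 1]."""
    return (rating + 2) / 6


SCALE = [4, 3, 2, 1, -1, -2]
rescaled = [rescale(r) for r in SCALE]

print(rescaled)
```

The top rating is sent to 1 and the bottom rating to 0, so every attribute weight used in the fit is non-negative.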

To fit parameters, we use MATLAB’s fmincon with the active-set algorithm
and the constraint that verb entries must be greater than 0. Results are reported
in Tables 5 and 6. F-HA stands for ‘Furniture and Household Appliances’, F-P
‘Foods and Plants’, W-T ‘Weapons and Tools’, B-D ‘Buildings and Dwellings’,
M-V ‘Machines and Vehicles’, B-P ‘Birds and Pets’.

Table 5: Cosine similarity measure for multilinear regression (MLR) and for the
compositional distributional (CD) model, for the ordering ‘A which is a B’

Cosine similarity   F-HA     F-P      W-T      B-D      M-V      B-P
MLR                 0.9905   0.9944   0.9791   0.9839   0.9948   0.9727
CD                  0.9996   1.0000   0.9997   0.9996   1.0000   0.9999

Table 6: Cosine similarity measure for multilinear regression (MLR) and for the
compositional distributional (CD) model, for the ordering ‘B which is an A’

Cosine similarity   F-HA     F-P      W-T      B-D      M-V      B-P
MLR                 0.9898   0.9965   0.9930   0.9945   0.9948   0.9673
CD                  0.9997   1.0000   0.9996   0.9997   1.0000   0.9999

Modelling the combination of the pairs of concepts using the grammatical
attributes of the phrase ‘A which is a B’ allows for a greater accuracy in
modelling the conjunction. In the tables above, we have used the same matrix for
the verb ‘is’ in each ordering. It is unsurprising that we are able to obtain a
good fit to the data, since we are using $k^2$ weights in the verb ‘to be’ for
$2k$ datapoints, and the results we show here are therefore not of any
statistical significance. However, by using the verb ‘to be’ in this way, we are
able to take account of how attributes interact with one another. Further work
will examine how the matrices thus obtained reflect Hampton’s findings regarding
the necessity and impossibility of attributes, attribute emergence and
inheritance failure.
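
The fitting problem can be illustrated without MATLAB. The sketch below is not the procedure used above: it swaps in plain projected gradient descent on the squared error, keeps the verb entries non-negative by clipping at zero, and runs on small hypothetical attribute vectors rather than Hampton’s data. All names and numbers are assumptions of this example.

```python
def mat_vec(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def predict(head, verb, modifier):
    """Attribute weights of 'head which is a modifier'."""
    return [h * x for h, x in zip(head, mat_vec(verb, modifier))]


def loss(verb, a, b, ab, ba):
    """Squared error over both orderings of the conjunction."""
    err_ab = [p - t for p, t in zip(predict(a, verb, b), ab)]
    err_ba = [p - t for p, t in zip(predict(b, verb, a), ba)]
    return sum(e * e for e in err_ab + err_ba)


def fit_verb(a, b, ab, ba, steps=2000, lr=0.2):
    """Projected gradient descent for a non-negative verb matrix."""
    k = len(a)
    # Start from the identity, i.e. each attribute only 'is' itself.
    verb = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for _ in range(steps):
        err_ab = [p - t for p, t in zip(predict(a, verb, b), ab)]
        err_ba = [p - t for p, t in zip(predict(b, verb, a), ba)]
        for i in range(k):
            for j in range(k):
                grad = 2 * (err_ab[i] * a[i] * b[j] + err_ba[i] * b[i] * a[j])
                # Gradient step followed by projection onto entries >= 0.
                verb[i][j] = max(0.0, verb[i][j] - lr * grad)
    return verb


# Hypothetical rescaled importance weights for k = 3 attributes.
a = [0.9, 0.2, 0.5]
b = [0.3, 0.8, 0.6]
ab = [0.7, 0.4, 0.6]  # 'A which is a B'
ba = [0.5, 0.6, 0.7]  # 'B which is an A'

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
initial_loss = loss(identity, a, b, ab, ba)
fitted = fit_verb(a, b, ab, ba)
final_loss = loss(fitted, a, b, ab, ba)

print(initial_loss, final_loss)
```

With three attributes the matrix has nine free entries for six target values, which mirrors the remark above that a good fit is expected when $k^2$ weights are fitted to $2k$ datapoints.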


5 Discussion 


We have described a compositional distributional model of meaning [Coecke et
al., 2010], which utilises grammar to describe how the meaning of a compound
arises from the meaning of its parts. This account of meaning is inherently
compositional, and importantly, the meanings of composites inhabit the same
space as their constituents, so that the inner product may be used to compare
concepts directly. We apply this model to the ‘pet fish’ problem, describing how
the phrases ‘pet fish’ and ‘pet which is a fish’ may be modelled within the
formalism, and giving two illustrative models which show how the qualitative
phenomena are naturally produced. We go on to model a set of data from Hampton
[1987]. This highlights how attributes in concepts interact, via the verb ‘to be’.

The approach we have outlined contrasts with approaches in the quantum 
cognition literature departing from an assumption of non-compositionality. The 
claim of non-compositionality in the literature refers to the fact that human 
concept combination and judgements cannot be modelled using classical proba¬ 
bility. Instead, we take a semantic approach. Within the model of meaning that 
we describe, the meaning of a sentence is rendered as a vector. The meanings of
individual words in the sentence or noun phrase are given by vectors or tensors,
and a method for combining them is specified. We therefore view the meaning
of a complex expression as being exactly specified by the meaning of its parts 
and of how they are combined. The ‘non-compositional’ phenomena described 
may well have a description that is compositional when both meaning and gram¬ 
mar are taken into account, and in particular we have argued that the ‘pet fish’ 
example may be viewed as compositional. 

Further work will extend the account to other cognitive phenomena. We can
straightforwardly apply the modelling of ‘pet fish’ to account for the conjunction
fallacy [Tversky and Kahneman, 1983], since this is an example of overextension.
Another oft-cited phenomenon is the asymmetry of similarity judgements
[Tversky, 1977], in which, for example, Korea is judged more similar to China
than China is to Korea. There are two approaches we might take to modelling
this. Firstly, as detailed in Coecke et al. [2010], we can choose a graded
truth-theoretic space as the sentence space. Then, the sentence ‘Korea is similar
to China’ can be modelled as mapping to a higher value than ‘China is similar to
Korea’.


Alternatively, a fuller account of the ways in which concepts interact can be
developed. Whilst synonymy is useful as a proxy for membership, and of course
in comparing meaning, we must develop measures that allow the description
of various relationships between concepts. Balkır [2014] uses the non-symmetric
measure of relative entropy to characterise hyponymy, which is related to the idea
of membership in a concept. Hyponymy is a stricter notion than membership,
however, and therefore we need to generalise this model. Other relationships are
typicality and meronymy, where a concept forms part of another concept, such
as ‘finger’ to ‘hand’.


We will further investigate evidence for these types of phenomena in text
corpora, and different ways of modelling adjectives. Given a large enough number
of dimensions, it is possible to create a matrix for an adjective that can exactly
recreate the meaning vector for an adjective-noun combination. It would be
interesting to look at the existence of these phenomena when the context,
as expressed by a choice of basis vectors, is specified, rather than being, for
example, the most commonly used words in the corpus.


Another area for research is the interaction of ambiguous concepts. These
ideas are investigated in Bruza et al. [2013], in which the authors elicit
similarity judgements on novel combinations of ambiguous words such as ‘apple
chip’. Here, ‘apple’ could be interpreted as either a fruit or a computer brand,
and ‘chip’ as either food or hardware. The authors define non-compositionality as
the failure of the interpretation of the combination of two concepts such as
‘apple chip’ to be modelled as a joint probability distribution over the
interpretations of the two constituent concepts. Within the compositional
distributional model of meaning that we have described, ambiguity in word
meanings can be modelled by the use of density matrices, and this ambiguity
interacts with other words in the sentence which may serve to disambiguate the
word [Piedeleu et al., 2015]. It would be interesting to model the phenomena
found in Bruza et al. [2013] within this framework.


Finally, the role of the verb ‘to be’ should be investigated. This verb should 
have a functional role, as do relative pronouns, as well as a distributionally 
determined meaning. 